Establishing a Formal Benchmarking Process for Sentiment Analysis for the Bangla Language
A K M Shahariar Azad Rabby, Aminul Islam, Fuad Rahman
Accepted to be presented at FTC 2020 - Future Technologies Conference 2020, 5-6 November 2020, Vancouver, Canada
Description
Tracking sentiments is a critical task in many natural language processing applications. Considerable work has been done for widely used languages
such as English. However, for many languages, including Bangla, sentiment analysis is still in early development. Most of the research on this topic
suffers from three key issues: (a) the lack of standardized publicly available datasets, (b) the subjectivity of the reported results, which generally manifests as a
lack of agreement on core sentiment categorizations, and finally, (c) the lack of
an established framework where these efforts can be compared to a formal
benchmark. Thus, this seems to be an opportune moment to establish a benchmark for sentiment analysis in Bangla. With that goal in mind, this paper presents benchmark results of ten different sentiment analysis solutions on three
publicly available Bangla sentiment analysis corpora. As part of the benchmarking process, we have optimized these algorithms for the task at hand. Finally,
we establish and present sixteen different evaluation metrics for benchmarking
these algorithms. We hope that this paper will jumpstart an open and transparent benchmarking process, one that we plan to update every two years, to help
validate new algorithms reported in this area in the future.